Sir/Madam:

Please enter the following preliminary amendments for the above-referenced application: 



IN THE SPECIFICATION: 

Please amend the following paragraphs of the specification (marked-up versions of the 
following amended paragraphs are attached hereto as Appendix A): 
Replacement paragraph at page 3, lines 7-10 (clean version):

One non-limiting advantage of the invention is that it presents a method for defining 
selection commands for both structured and unstructured documents. Structured documents can 
be interpreted as having structural content and textual/character content. Unstructured 
documents can only be interpreted as having textual/character content. 



Replacement paragraph at page 14, line 22 - page 15, line 2 (clean version): 

As shown in Figure 4, a selection envelope 1400 is a container for a section of a source 
document 1100, delineated by two markers referred to as the begin marker 1200 and end marker
1300. These markers are virtual delineators that are created only during runtime by selection 
command 1600. The begin marker 1200 defines the beginning of the selection envelope 1400 
while the end marker 1300 defines the end of the selection envelope. The selected contents 1500 
is what lies between these two markers. 



Replacement paragraph at page 16, lines 1 - 8 (clean version): 

For structured documents, a selection envelope can contain various arrangements of 
structures. As shown in Figure 5, a structured document may be represented as a hierarchical 
structure 1110, including a parent object 1111, child objects 1112, 1114, and descendants 1113, 
1115. A selection envelope 1410 made of a begin marker 1210 and end marker 1310 may
contain any valid structural element represented by object 1112 and descendants 1113. Selection
envelopes containing structural objects place their begin markers and end markers immediately
adjacent to the object so that they exclusively define the desired object. Just as the structure of a 
document may exist as an abstract system created by an XML processor, the begin and end 
markers are virtual objects in the document. 
Replacement paragraph at page 16, lines 9-12 (clean version):

For unstructured documents, a selection envelope can contain contiguous segments of 
text based on the textual representation of the document. An example of a selection envelope 
1420 with relation to an unstructured document 1120, which begins at location 1121 and ends at
location 1122, is shown in Figure 6. Begin marker 1220 and end marker 1320 are positioned
around segments of content 1520 within the document 1120, near possible locating strings 1130,
1131, respectively. 
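
By way of illustration only, the placement of virtual begin and end markers near locating strings in an unstructured document might be sketched as follows. The names `SelectionEnvelope` and `select_between`, the sample text, and the choice to place the markers immediately inside the locating strings are assumptions made for this sketch and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SelectionEnvelope:
    """Virtual begin/end markers; nothing is written into the source text."""
    begin: int  # character offset of the begin marker
    end: int    # character offset of the end marker

    def contents(self, document: str) -> str:
        # The selected contents are whatever lies between the two markers.
        return document[self.begin:self.end]


def select_between(document: str, begin_string: str,
                   end_string: str) -> Optional[SelectionEnvelope]:
    """Hypothetical command: markers sit just after begin_string and
    just before the next occurrence of end_string."""
    start = document.find(begin_string)
    if start == -1:
        return None
    begin = start + len(begin_string)
    end = document.find(end_string, begin)
    if end == -1:
        return None
    return SelectionEnvelope(begin, end)


document = "Header. Price: 42 USD. Footer."
envelope = select_between(document, "Price: ", " USD")
print(envelope.contents(document))  # prints "42"
```

The markers here are plain offsets computed at runtime, which is one possible reading of "virtual delineators"; the source text itself is never modified.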

Replacement paragraph at page 16, lines 13-18 (clean version): 

More generally, a system of selection envelopes can be defined so that each successive 
selection envelope, or child envelope, is defined relative to a previously defined envelope, or 
parent envelope. As shown in Figure 7, selection envelope 1430 may be defined for source 
document 1100 via selection command 1601. Envelope 1430 may then be used to produce 
envelope 1431 via selection command 1602, envelope 1431 may then produce envelope 1432 via
selection command 1603, and envelope 1432 may produce envelope 1433 via selection 
command 1604, thereby creating a series of nested selection envelopes 1400 having begin 
markers 1200 and end markers 1300. Selection commands are more fully explained below. 

Replacement paragraph at page 16, lines 19-24 (clean version): 

The relationship between a parent envelope and its successor, or child envelope can take 
form in one of three ways. A child selection envelope 1441 may be either nested within a parent 
selection envelope 1440, as shown in Figure 8; partially overlapping a parent selection envelope 
1440, as shown in Figure 9; or completely outside of a parent selection envelope 1440, as shown 
in Figure 10. The scope of the selection is iteratively refined until the desired content has been 
selected. 
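
As a non-authoritative sketch, the three parent/child relationships can be expressed as a comparison of marker positions. The function name `relation`, the representation of an envelope as a `(begin, end)` pair of offsets, and the treatment of a child that touches only the boundary of the parent as "outside" are assumptions of this example.

```python
def relation(parent, child):
    """Classify a child envelope against a parent envelope.

    Each envelope is a (begin, end) pair of offsets, with begin <= end.
    """
    p_begin, p_end = parent
    c_begin, c_end = child
    if p_begin <= c_begin and c_end <= p_end:
        return "nested"        # child lies entirely within the parent
    if c_end <= p_begin or c_begin >= p_end:
        return "outside"       # child shares no content with the parent
    return "overlapping"       # child shares some, but not all, content


parent = (10, 50)
print(relation(parent, (20, 30)))  # prints "nested"
print(relation(parent, (40, 60)))  # prints "overlapping"
print(relation(parent, (60, 70)))  # prints "outside"
```

Note that a child extending past both ends of the parent is also reported as "overlapping" under this sketch, since it is neither nested nor outside.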
Replacement paragraph at page 16, lines 19 - 24 (clean version):

Furthermore, multiple sets of selection envelopes may exist simultaneously for a given 
document when a selection command is applied. Referring to Figure 11, a structured document
1110 can be seen to have two selection envelopes 1410 and 1411 (e.g., having begin and end
markers 1210, 1310 and 1211, 1311, respectively) that contain two different object structures.
Referring to Figure 12, an unstructured document 1125 (beginning at location 1126 and ending
at location 1127) can be seen to also have two selection envelopes 1421, 1422, having begin and
end markers 1221, 1321 and 1222, 1322, respectively. 

Replacement paragraph at page 18, lines 11-17 (clean version): 

For structured documents 1110, the general relationship between selection commands
and selection envelopes is illustrated in Figure 14. A selection command 1610 may identify an 
object structure composed of a child object 1112 and descendant objects 1113, and thus specify a 
selection envelope 1410 around the structure. For unstructured documents 1120, this general 
relationship is illustrated in Figure 15. A selection command 1620 may define the locations of 
the virtual begin marker 1220 and virtual end marker 1320 and thus, define a selection envelope 
1420. 

Replacement paragraph at page 23, lines 3-7 (clean version): 

The method 2000 will be defined as follows for the following four examples. The source 
document Y is an HTML document, seen in rendered form in Figure 16 and in HTML source 
view in Figure 17. The examples will illustrate the creation of four selection envelopes s1, s2, s3,
and s4 that respectively identify x1, x2, x3, and x4. As described above, selection envelopes are
functions of selection commands 'c' that are defined below. 
Replacement paragraph at page 26, lines 16-18 (clean version):

To further elaborate on the use of selection commands, the second table of document Y 
will be selected for use in Y'. This again illustrates the use of position or sequential index of an
object within a parent selection envelope. 

Replacement paragraph at page 29, lines 2-7 (clean version): 

Referring to step 2001, the desired content has not yet been selected thus necessitating 
the definition of another selection envelope. For this second selection envelope, the source in 
step 2004 is document Y and x3'. For step 2005, selection command Ck 1 has not yet been chosen.
To determine c, the process of Figure 13B is again followed. Step 2016 dictates that either
structural, pattern-based or any combination of commands c1, c2, or c3 can be used.

Replacement paragraph at page 29, lines 8-9 (clean version): 

For step 2017, the first selection command is determined to be a structural selection 
command 2013, as seen in Figure 13B. Command c1 is parameterized as follows:

Replacement paragraph at page 30, lines 12-16 (clean version): 

This selection envelope example illustrates the use of a command that combines 
structural and pattern-based commands. Yet again, the process of Figure 13A is used. Step 
2004 defines the source information for envelope specification; in this case, the source is 
document Y, as shown in Figure 17. For step 2005, a selection command c k l is to be selected 
from the set of functions C defined above and then parameterized. 
Replacement paragraph at page 30, lines 17 - page 31, line 16 (clean version):



In order to do this, steps 2016 and 2017 of the process in Figure 13B are used. Given that 
document Y is structured, step 2016 of the process seen in Figure 13B allows either structural, 
pattern-based or any combination of commands c1, c2, or c3 to be used. For the purposes of the
example, the desired content, x4, is deemed to be reliably extractable by immediately using a
selection command c3. Command c3 combines structural and pattern-based commands using
programmatic constructs. Thus for step 2017, both a structural/contextual selection command
2013 and a pattern-based selection command 2015 are selected. The selection command c3 is
parameterized as follows: 

type = row 
instance = 1 
string = "Row1"
inclusion = true 

Thus, c3 is such that

c3 defines a resulting selection envelope, s4, such that:
s4 = f(c3)

which is equivalent to equation (5) above. Stated another way,
s4 = c3(Y) = x4

where x4 can be seen in Figure 23. As the desired content has been selected, the answer
for step 2001 is 'yes' and the selected content x4 is available for use in Y'.
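
For illustration only, a combined structural and pattern-based command of this kind might be sketched as below. The sample table is hypothetical and is not the document Y of Figure 17. The function name `combined_select` is likewise an assumption, as are the readings that "type = row" corresponds to `tr` elements, that "instance" is a 1-based index among matching rows, and that "inclusion = true" keeps rows whose text contains the string.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed stand-in for a structured source document.
SOURCE = """<table>
  <tr><td>Header</td><td>Value</td></tr>
  <tr><td>Row1</td><td>alpha</td></tr>
  <tr><td>Row2</td><td>beta</td></tr>
</table>"""


def combined_select(root, tag, instance, string, inclusion):
    """Select the instance-th <tag> object whose text does (or does not)
    contain the given string."""
    matches = []
    for element in root.iter(tag):          # structural part: every <tag> object
        text = "".join(element.itertext())  # textual content of that object
        if (string in text) == inclusion:   # pattern-based part
            matches.append(element)
    # instance is assumed to be a 1-based sequential index among the matches
    if 0 < instance <= len(matches):
        return matches[instance - 1]
    return None


root = ET.fromstring(SOURCE)
row = combined_select(root, tag="tr", instance=1, string="Row1", inclusion=True)
print([cell.text for cell in row])  # prints ['Row1', 'alpha']
```

The returned element and its descendants play the role of the selected content; how the begin and end markers are actually represented is left open here.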
IN THE CLAIMS:

Please amend claim 1 (a marked-up version of claim 1 is attached hereto as Appendix B) 
and add new claims 2 - 18 as follows: 

Claim 1 (amended, clean version) 

A method for extracting content from a document, comprising the steps of: 

creating at least one selection envelope based upon a plurality of selection 
commands for locating specific content within said document; and 

selecting content from said document based upon said at least one selection envelope. 
Claim 2 (new) 

The method of claim 1 wherein said selection envelope comprises a begin marker and an 
end marker, which respectively define the beginning and end of said selection envelope. 

Claim 3 (new) 

The method of claim 1 wherein said at least one selection envelope comprises a parent 
envelope and a child envelope. 

Claim 4 (new) 

The method of claim 3 wherein said child envelope is nested within said parent envelope. 
Claim 5 (new)

The method of claim 3 wherein said child envelope partially overlaps said parent 
envelope. 

Claim 6 (new) 

The method of claim 3 wherein said child envelope is completely outside of said parent 
envelope. 

Claim 7 (new) 

The method of claim 1 wherein said plurality of selection commands comprises a 
command based on document structure. 

Claim 8 (new) 

The method of claim 1 wherein said plurality of selection commands includes a 
command based on a character pattern. 

Claim 9 (new) 

The method of claim 1 wherein said plurality of selection commands comprises a 
combined command based on both document structure and a character pattern. 

Claim 10 (new) 

A method for extracting content from a source comprising the steps of: 
identifying said source for extracting content; 
parameterizing at least one selection command to operate on said source;
defining a selection envelope to select desired content from said source by use of 
said at least one selection command; 

selecting content from said source by use of said selection envelope; 
determining whether said desired content has been selected; and 
extracting said selected content if said desired content has been selected. 

Claim 11 (new)

The method of claim 10 further comprising the steps of: 

defining a second selection envelope by use of at least one second selection 
command if said desired content has not been selected; 

selecting content from said source by use of said second selection envelope; 
determining whether said desired content has been selected; and 
extracting said selected content if said desired content has been selected. 

Claim 12 (new) 

The method of claim 11 wherein said first selection envelope comprises a parent
envelope and said second selection envelope comprises a child envelope. 

Claim 13 (new) 

The method of claim 10 wherein said source comprises a document. 
Claim 14 (new)

The method of claim 10 wherein said source comprises a section of a document. 
Claim 15 (new) 

The method of claim 10 wherein said step of parameterizing said at least one selection 
command includes determining whether said source is structured or unstructured, and selecting 
said at least one selection command is based upon this determination. 

Claim 16 (new) 

The method of claim 15 wherein said at least one selection command comprises a 
structure based command selected from the group including select by name commands, select by 
location commands, select by sibling relationship commands, select by attribute commands and 
select by counter commands. 

Claim 17 (new) 

The method of claim 15 wherein said at least one selection command comprises a 
character based command selected from the group including select text contain commands and 
select text matching pattern commands. 

Claim 18 (new) 

The method of claim 15 wherein said at least one selection command comprises a 
combined structure and character based command. 
It is respectfully asserted that the foregoing amendments place the application in
better condition for examination, and that none of the foregoing changes contain any new 
matter. 

The Commissioner is hereby authorized to charge any additional fees which may be 
required, or credit any overpayment to Deposit Account No. 07-1896.



Dated: April 22, 2002 



Respectfully submitted, 

GRAY CARY WARE & FREIDENRICH LLP




David Alberti 
Reg. No. 43,465 
Attorney for Applicants 



GRAY CARY WARE & FREIDENRICH
1755 Embarcadero Road 
Palo Alto, California 94303-3340 
Telephone: (650) 833-2052 
Facsimile: (650) 320-7401



Gray Cary\EM\7109419.1
2102299-991130 






Attorney Docket Number 2102299-991130



APPENDIX A 

Replacement paragraph at page 3, lines 7-10 (marked-up version): 

One non-limiting advantage of the invention is that it presents a method for defining 
selection commands for both structured and unstructured documents. Structured documents can 
be interpreted as having structural content and textual/character content. Unstructured 
documents can only be interpreted as having [textural] textual/character content.



Replacement paragraph at page 14, line 22 - page 15, line 2 (marked-up version): 

As shown in Figure 4, a selection envelope 1400 is a container for a section of a source 
document 1100, delineated by two markers referred to as the begin marker 1200 and end marker 
1300. These markers are virtual delineators that are created only during runtime by selection 
command 1600. The begin marker 1200 defines the beginning of the selection envelope 1400
while the end marker 1300 defines the end of the selection envelope. The selected contents 1500 
is what lies between these two markers. 



Replacement paragraph at page 16, lines 1 - 8 (marked-up version): 

For structured documents, a selection envelope can contain various arrangements of 
structures. As shown in Figure 5, a structured document may be represented as a hierarchical 
structure 1110, including a parent object 1111, child objects 1112, 1114, and descendants 1113,
1115. A selection envelope 1410 made of a begin marker 1210 and end marker 1310 may
contain any valid structural element represented by object 1112 and descendants 1113. Selection
envelopes containing structural objects place their begin markers and end markers immediately
adjacent to the object so that they exclusively define the desired object. Just as the structure of a 
document may exist as an abstract system created by an XML processor, the begin and end 
markers are virtual objects in the document. 

Replacement paragraph at page 16, lines 9-12 (marked-up version):

For unstructured documents, a selection envelope can contain contiguous segments of 
text based on the textual representation of the document. An example of a selection envelope 
1420 with relation to an unstructured document 1120, which begins at location 1121 and ends at
location 1122, is shown in Figure 6. Begin marker 1220 and end marker 1320 are positioned
around segments of content 1520 within the document 1120, near possible locating strings 1130,
1131, respectively.

Replacement paragraph at page 16, lines 13-18 (marked-up version):

More generally, a system of selection envelopes can be defined so that each successive 
selection envelope, or child envelope, is defined relative to a previously defined envelope, or 
parent envelope. As shown in Figure 7, selection envelope 1430 may be defined for source 
document 1100 via selection command 1601. Envelope 1430 may then be used to produce
envelope 1431 via selection command 1602, envelope 1431 may then produce envelope 1432 via
selection command 1603, and envelope 1432 may produce envelope 1433 via selection
command 1604, thereby creating a series of nested selection envelopes 1400 having begin
markers 1200 and end markers 1300. [and so on.] Selection commands are more fully explained 
below. 

Replacement paragraph at page 16, lines 19-24 (marked-up version):

The relationship between a parent envelope and its successor, or child envelope can take 
form in one of three ways. A child selection envelope 1441 may be either nested within a parent 
selection envelope 1440, as shown in Figure 8; partially overlapping a parent selection envelope 
1440, as shown in Figure 9; or completely outside of a parent selection envelope 1440, as shown
in Figure 10. The scope of the selection is iteratively refined until the desired content has been
selected. 

Replacement paragraph at page 16, lines 19-24 (marked-up version):

Furthermore, multiple sets of selection envelopes may exist simultaneously for a given 
document when a selection command is applied. Referring to Figure 11, a structured document 
1110 can be seen to have two selection envelopes 1410 and 1411 (e.g., having begin and end
markers 1210, 1310 and 1211, 1311, respectively) that contain two different object structures.
Referring to Figure 12, an unstructured document 1125 (beginning at location 1126 and ending
at location 1127) can be seen to also have two selection envelopes 1421, 1422, having begin and
end markers 1221, 1321 and 1222, 1322, respectively.

Replacement paragraph at page 18, lines 11-17 (marked-up version):

For structured documents 1110, the general relationship between selection commands
and selection envelopes is illustrated in Figure 14. A selection command 1610 may identify an
object structure composed of a child object 1112 and descendant objects 1113, and thus specify a
selection envelope 1410 around the structure. For unstructured documents 1120, this general
relationship is illustrated in Figure 15. A selection command 1620 may define the locations of 
the virtual begin marker 1220 and virtual end marker 1320 and thus, define a selection envelope 
1420. 

Replacement paragraph at page 23, lines 3-7 (marked-up version):

The method [1000] 2000 will be defined as follows for the following four examples. The
source document Y is an HTML document, seen in rendered form in Figure 16 and in HTML
source view in Figure 17. The examples will illustrate the creation of four selection envelopes
s1, s2, s3, and s4 that respectively identify x1, x2, x3, and x4. As described above, selection
envelopes are functions of selection commands 'c' that are defined below.
Replacement paragraph at page 26, lines 16-18 (marked-up version):

To further elaborate on the use of selection commands, the second table of document Y 
will be selected for use in Y'. This again illustrates the use of position or sequential index [or] of
an object within a parent selection envelope. 

Replacement paragraph at page 29, lines 2-7 (marked-up version):

Referring to step 2001, the desired content has not yet been selected thus necessitating 
the definition of another selection envelope. For this second selection envelope, the source in 
step 2004 is document Y and x3'. For step 2005, selection command c^ has not yet been chosen.
To determine c, the process of Figure 13B is again followed. Step 2016 dictates that either
structural, pattern-based or any combination of commands c1, c2, or c3 can be used.

Replacement paragraph at page 29, lines 8-9 (marked-up version):

For step 2017, the first selection command is determined to be a structural selection 
command 2013, as seen in Figure 13B. Command c1 is parameterized as follows:

Replacement paragraph at page 30, lines 12 - 16 (marked-up version):

This selection envelope example illustrates the use of a command that combines 
structural and pattern-based commands. Yet again, the process of Figure 13A is used. Step 
2004 defines the source information for envelope specification; in this case, the source is 
document Y, as shown in Figure 17. For step 2005, a selection command Ck 1 is to be selected 
from the set of functions C defined above and then parameterized. 
Replacement paragraph at page 30, lines 17 - page 31, line 16 (marked-up version):



In order to do this, steps 2016 and 2017 of the process in Figure 13B are used. Given that 
document Y is structured, step 2016 of the process seen in Figure 13B allows either structural, 
pattern-based or any combination of commands c1, c2, or c3 to be used. For the purposes of the
example, the desired content, x4, is deemed to be reliably extractable by immediately using a
selection command c3. Command c3 combines structural and pattern-based commands using
programmatic constructs. Thus for step 2017, both a structural/contextual selection command
2013 and a pattern-based selection command 2015 are selected. The selection command c3 is
parameterized as follows: 

type = row 
instance = 1 
string = "Row1"
inclusion = true 

Thus, c3 is such that

c3 defines a resulting selection envelope, s4, such that:
s4 = f(c3)

which is equivalent to equation (5) above. Stated another way,
s4 = c3(Y) = x4

where x4 can be seen in Figure 23. As the desired content has been selected, the answer
for step 2001 is 'yes' and the selected content x4 is available for use in Y'.
APPENDIX B
Claim 1 (amended, marked-up version) 

A method for extracting content from a document, comprising the steps of: 

creating at least one selection envelope based upon a plurality of selection 
commands for locating specific content within said document; and 

selecting content from said document based upon said at least one selection envelope.